Data was pulled from Morningstar Direct on Jan. 22, 2020, and the files were combined using combine_files.r. After filtering in Morningstar Direct:
56,284 rows, 36 columns
Added column Obsolete to indicate if share class is in existence or obsolete. I assume a blank Obsolete..Date means the share class still exists.
Also added Inception_Year and Obsolete_Year because these variables are used below.
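Data was then filtered in R to remove Load Waived share classes and share classes whose inception and obsolete years are the same, and to apply the 2021 filter shown in the code:
48,520 rows, 41 columns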
library(tidyverse) #For Data Analysis
library(lubridate) #For working with dates
library(DT) #For visualizing tables
Full <- read_csv("data_combined.csv",
                 guess_max = 20000) %>%
  rename_all(make.names) %>%
  mutate(
    Inception_Year = year(Inception..Date),
    Obsolete_Year = year(Obsolete..Date),
    #A blank Obsolete..Date is assumed to mean the share class still exists
    Obsolete =
      case_when(
        !is.na(Obsolete..Date) ~ "Obsolete",
        TRUE ~ "Exists"
      )) %>%
  #Drop Load Waived share classes but keep rows with a missing type
  filter(Share.Class.Type != "Load Waived" |
           is.na(Share.Class.Type)) %>%
  #Drop share classes launched and made obsolete in the same year
  filter((Inception_Year != Obsolete_Year) |
           is.na(Obsolete_Year) |
           is.na(Inception_Year)) %>%
  #Keep rows unless both years are 2021; rows where this test is NA are also dropped
  filter(Inception_Year != 2021 |
           Obsolete_Year != 2021)
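The filters above pair each comparison with is.na() checks because dplyr's filter() drops rows where the condition evaluates to NA, not just rows where it is FALSE. A minimal sketch of that behavior, using a toy table with made-up values (the name `toy` and its contents are assumptions for illustration, not part of the Morningstar data):

```r
library(dplyr)

# Toy data (hypothetical values, not from the Morningstar pull)
toy <- tibble(
  Share.Class.Type = c("Inst", "Load Waived", NA),
  Inception_Year   = c(2015, 2016, 2017)
)

# Without the is.na() check, the NA row is dropped along with "Load Waived"
without_na_check <- toy %>%
  filter(Share.Class.Type != "Load Waived")

# With the check, only the "Load Waived" row is removed
with_na_check <- toy %>%
  filter(Share.Class.Type != "Load Waived" | is.na(Share.Class.Type))

nrow(without_na_check)
nrow(with_na_check)
```

The same reasoning applies to any filter on Inception_Year or Obsolete_Year, since both contain missing values.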
## Warning: Removed 83 rows containing non-finite values (stat_count).
There was a large drop-off in launches in 2020 compared with recent years. However, rationalization appears to have continued at roughly the same level as in recent years.
Of the 48,520 fund share classes examined, 83 are missing inception dates.
#A function to count the number of share classes either created or liquidated each year
Year_Count <- function(colz){
  Full %>%
    group_by_at(colz) %>%
    summarise(Count = n(), .groups = "drop") %>%
    arrange(desc(.[[1]]))
}
inception_year <- Year_Count(colz = c("Inception_Year")) #Tallies share classes by year created
obsolete_year <- Year_Count(colz = c("Obsolete_Year")) #Tallies share classes by year liquidated
merge_type <- Year_Count(colz = c("Obsolete_Year", "Obsolete..Type")) #Tallies share classes by year liquidated and liquidation type
#Counts the net number of share classes created/cut
Net_Count <- full_join(inception_year, obsolete_year, by = c("Inception_Year" = "Obsolete_Year"),
                       suffix = c("_inception", "_obsolete")) %>%
  rename(Year = Inception_Year) %>%
  group_by(Year) %>%
  mutate(Net_Count = sum(Count_inception, -Count_obsolete, na.rm = TRUE))
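To make the net calculation concrete, here is a small sketch on toy yearly tallies. The tables `launched` and `closed` and their values are hypothetical, and the sketch swaps in coalesce() for the grouped sum(..., na.rm = TRUE) used above; on these toy values both treat a missing tally as zero.

```r
library(dplyr)

# Toy yearly tallies (hypothetical values)
launched <- tibble(Year = c(2019, 2020), Count = c(10, 4))
closed   <- tibble(Year = c(2020, 2021), Count = c(6, 3))

net <- full_join(launched, closed, by = "Year",
                 suffix = c("_inception", "_obsolete")) %>%
  # coalesce() turns a missing tally into zero before subtracting
  mutate(Net_Count = coalesce(Count_inception, 0) - coalesce(Count_obsolete, 0)) %>%
  arrange(Year)

net
```

A year that appears in only one table still gets a row, because full_join() keeps unmatched rows from both sides.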
Obsolete_Month <- function(x){
  Full %>%
    filter(Obsolete_Year == x) %>%
    mutate(Month = month(Obsolete..Date)) %>%
    group_by(Month) %>%
    count(name = paste("Obsolete_", x, sep = "") )
}
Years <- c("2018", "2019", "2020")
By_Month_Obsolete <- Years %>%
  map(Obsolete_Month) %>%
  reduce(full_join, by = "Month")
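The launch chart further down uses a By_Month_Inception table whose construction is not shown here. One plausible way to build it mirrors Obsolete_Month() for inception dates. The function name `Inception_Month`, its `data` argument, and the toy table `toy_full` are all assumptions made for this sketch, not the code used to produce the results in this report.

```r
library(dplyr)
library(lubridate)
library(purrr)

# Toy stand-in for Full (hypothetical dates)
toy_full <- tibble(
  Inception..Date = as.Date(c("2019-01-15", "2019-01-20",
                              "2020-01-05", "2020-03-10"))
) %>%
  mutate(Inception_Year = year(Inception..Date))

# Hypothetical counterpart to Obsolete_Month(): launches per month for one year
Inception_Month <- function(data, x){
  data %>%
    filter(Inception_Year == x) %>%
    mutate(Month = month(Inception..Date)) %>%
    count(Month, name = paste0("Inception_", x))
}

# One column of monthly counts per year, joined on Month
By_Month_Inception_toy <- c(2019, 2020) %>%
  map(~ Inception_Month(toy_full, .x)) %>%
  reduce(full_join, by = "Month") %>%
  arrange(Month)

By_Month_Inception_toy
```

Months with no launches in a given year come through as NA after the join, which is one source of "missing values" warnings when the result is plotted.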
Share Class Liquidations & Mergers
Share Class Launches by Month
By_Month_Inception %>%
  pivot_longer(
    !Month,
    names_to = "Year",
    values_to = "Count"
  ) %>%
  mutate(Year = str_replace(Year, "Inception_", "")) %>%
  ggplot(data = . , aes(x = Month, y = Count, color = Year)) +
  geom_line() +
  ggtitle("Fund Launches") +
  ylab("Count") +
  xlab("Month") +
  scale_x_continuous(breaks = seq(1, 12, by = 1)) +
  theme_classic() +
  theme(plot.title = element_text(size = 11, face = "bold"))
## Warning: Removed 1 rows containing missing values (position_stack).
The data from Morningstar is at the share class level. This is a method to look at launches at the fund level.
#This looks at when a fund's oldest share class was created
#This assumes that the oldest share class's inception date is equal to the fund's creation date
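New_Funds <- Full %>%
  filter(Oldest..Share.Class == "Yes") %>%
  group_by(Inception_Year, Index..Fund) %>%
  summarise(Count = n(), .groups = "drop") %>%
  #Compare each row with the previous available year for the same Index..Fund value
  group_by(Index..Fund) %>%
  arrange(Inception_Year, .by_group = TRUE) %>%
  mutate(
    Pct_Change = round(
      ((Count/lag(Count) - 1) * 100),
      1
    )) %>%
  ungroup() %>%
  arrange(desc(Inception_Year))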
## Warning: Removed 1 row(s) containing missing values (geom_path).
The method above uses the inception date of the Oldest..Share.Class as a proxy for when a fund was created. This method uses FundId instead.
The oldest inception date tied to a FundId is a proxy for when the fund was created. Likewise, the oldest obsolete date tied to a FundId (with no active share classes) is likely when that fund was liquidated or merged.
In some cases a FundId had no corresponding Oldest..Share.Class. This could be an error, or a sign that a fund’s oldest share class is not captured in the data from Morningstar.
The results for 2020 ended up equal to those from the method above.
This method shows 15,825 funds in 2020 versus 15,677 in 2019, a change of 148 funds. It shows 15,464 funds in 2018, so the change from 2018 to 2019 was 213 funds.
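The FundId approach can be sketched as a grouped summary. This is only an illustration on a toy table: the table `toy`, the output column names, and the FundId values are hypothetical. The sketch also takes the latest obsolete date among a fund's share classes as the fund's end date, on the assumption that a fund ends when its last share class does; that choice may differ from the code behind the figures above.

```r
library(dplyr)
library(lubridate)

# Toy share-class data (hypothetical FundId values and dates)
toy <- tibble(
  FundId          = c("F1", "F1", "F2", "F2"),
  Inception..Date = as.Date(c("2010-05-01", "2015-02-01",
                              "2018-03-01", "2019-06-01")),
  Obsolete..Date  = as.Date(c(NA, "2019-01-01",
                              "2020-04-01", "2020-09-01"))
)

fund_level <- toy %>%
  group_by(FundId) %>%
  summarise(
    # Earliest share class inception stands in for the fund's creation date
    Fund_Inception = min(Inception..Date),
    # A fund gets an end date only when every share class has an obsolete date
    Fund_Obsolete  = if (all(!is.na(Obsolete..Date))) max(Obsolete..Date) else as.Date(NA),
    .groups = "drop"
  ) %>%
  mutate(
    Fund_Inception_Year = year(Fund_Inception),
    Fund_Obsolete_Year  = year(Fund_Obsolete)
  )

fund_level
```

Here F1 still has an active share class, so it has no end date, while F2 is counted as ending in 2020.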
To look at which firms cut or added share classes, I first clean up the company names in a new column, Branding.Name.Mod.
company_change <- Full %>%
  mutate(
    Branding.Name.Mod =
      case_when(
        grepl("State Street", Branding.Name) ~ "State Street",
        grepl("TIAA", Branding.Name) ~ "TIAA/Nuveen",
        grepl("Nuveen", Branding.Name) ~ "TIAA/Nuveen",
        grepl("Eaton Vance", Branding.Name) ~ "Eaton Vance/Calvert",
        grepl("Calvert", Branding.Name) ~ "Eaton Vance/Calvert",
        grepl("iShares", Branding.Name) ~ "iShares/BlackRock",
        grepl("BlackRock", Branding.Name) ~ "iShares/BlackRock",
        grepl("PowerShares", Branding.Name) ~ "PowerShares/Invesco",
        grepl("Invesco", Branding.Name) ~ "PowerShares/Invesco",
        grepl("DWS$", Branding.Name) ~ "DWS/Xtrackers",
        grepl("Xtrackers", Branding.Name) ~ "DWS/Xtrackers",
        TRUE ~ Branding.Name))
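A quick way to check how these patterns behave is to run a few of them against a handful of names. The branding names below are made up for illustration. Note that "DWS$" only matches names that end in "DWS", and that case_when() uses the first matching rule, falling through to the original name when nothing matches.

```r
library(dplyr)

# Toy branding names (hypothetical examples)
toy <- tibble(
  Branding.Name = c("DWS", "DWS Xtrackers", "Calvert Research", "Vanguard")
)

cleaned <- toy %>%
  mutate(
    Branding.Name.Mod = case_when(
      grepl("DWS$", Branding.Name)      ~ "DWS/Xtrackers",       # name ends in "DWS"
      grepl("Xtrackers", Branding.Name) ~ "DWS/Xtrackers",       # name contains "Xtrackers"
      grepl("Calvert", Branding.Name)   ~ "Eaton Vance/Calvert",
      TRUE                              ~ Branding.Name          # unmatched names are kept as-is
    )
  )

cleaned
```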
#Tallies share classes by company and the other grouping columns
company_count <- function(colz){
  company_change %>%
    group_by_at(colz) %>%
    summarise(Count = n(), .groups = "drop") %>%
    arrange(desc(.[[2]]))
}
comp_inception <- company_count(colz = c("Branding.Name.Mod", "Inception_Year"))
comp_obsolete <- company_count(colz = c("Branding.Name.Mod", "Obsolete_Year"))
comp_new_funds <- company_count(colz = c("Branding.Name.Mod", "Inception_Year", "Oldest..Share.Class")) %>%
  filter(Oldest..Share.Class == "Yes")
comp_old_funds <- company_count(colz = c("Branding.Name.Mod", "Obsolete_Year", "Oldest..Share.Class")) %>%
  filter(Oldest..Share.Class == "Yes")
comp_rank <- function(x){
  x %>% filter(.[[2]] == "2020") %>%
    mutate(Rank = rank(-Count)) %>%
    filter(Rank <=20) %>%
    arrange(Rank)
}
top_cutters <- comp_rank(comp_obsolete)
top_old_cutters <- comp_rank(comp_old_funds)
top_launchers <- comp_rank(comp_inception)
top_new_launchers <- comp_rank(comp_new_funds)
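One detail worth knowing when reading the rankings: base R's rank() gives tied counts the average of their ranks, so ranks can be fractional and the Rank <= 20 cutoff may keep slightly more or fewer than 20 firms. The sketch below compares it with dplyr's min_rank() on made-up counts; the table `toy` is hypothetical, and min_rank() is shown only as an alternative, not as what the code above does.

```r
library(dplyr)

# Toy counts (hypothetical firms)
toy <- tibble(Firm = c("A", "B", "C"), Count = c(10, 10, 5))

ranked <- toy %>%
  mutate(
    Rank_avg = rank(-Count),          # ties share the average rank
    Rank_min = min_rank(desc(Count))  # ties share the lowest rank
  )

ranked
```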
(Share Class)
(Oldest Share Class)
(Share Class)
(Oldest Share Class)
I